Search results for "Web bot"

showing 10 items of 11 documents

Time series clustering with different distance measures to tell Web bots and humans apart

2022

The paper deals with the problem of differentiating Web sessions of bots and human users by observing some characteristics of their traffic at the Web server input. We propose an approach to cluster bots’ and humans’ sessions represented as time series. First, sessions are expressed as sequences of HTTP requests coming to the server at specific timestamps; then, they are preprocessed to form time series of limited length. Time series are clustered and the clustering performance is evaluated in terms of the ability to partition bots and humans into separate clusters. The proposed approach is applied to real server log data and validated with the use of different time series distance meas…

Web session; Time series; Unsupervised classification; Web bot detection; Internet robot; Similarity measure; Web bot; Clustering; Distance measure

ECMS 2022 Proceedings, edited by Ibrahim A. Hameed, Agus Hasan, Saleh Abdel-Afou Alaliyat

researchProduct

Bot or not? a case study on bot recognition from web session logs

2018

This work reports on a study of web usage logs to verify whether it is possible to achieve good recognition rates in the task of distinguishing between human users and automated bots using computational intelligence techniques. Two problem statements are given: offline (for completed sessions) and online (for sequences of individual HTTP requests). The former is solved with several standard computational intelligence tools. For the latter, a learning version of Wald’s sequential probability ratio test is used.

Sequential decision; Computer science; Problem statement; Computational intelligence; Engineering and technology; Machine learning; Classification; Session (web analytics); Task (project management); Work (electrical); Information systems; Sequential probability ratio test; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Web usage; Artificial intelligence; Web bot recognition

researchProduct

Improving clustering of Web bot and human sessions by applying Principal Component Analysis

2019

The paper addresses the problem of modeling Web sessions of bots and legitimate users (humans) as feature vectors for use as input to classification models. So far, many different features for discriminating between bots’ and humans’ navigational patterns have been considered in session models, but very few studies have been devoted to feature selection and dimensionality reduction in the context of bot detection. We propose applying Principal Component Analysis (PCA) to develop improved session models based on predictor variables that are efficient discriminants of Web bots. The proposed models are used in session clustering, whose performance is evaluated in terms of the purity …

Bot detection; Principal Component Analysis; PCA; Log analysis; Computer science; k-means; Internet robot; Classification; Web bot; Dimensionality reduction; Clustering; Web server; Feature selection; Data mining; Cluster analysis

Communications of the ECMS

researchProduct

Modeling a non-stationary bots’ arrival process at an e-commerce Web site

2017

The paper concerns the issue of modeling and generating a representative Web workload for Web server performance evaluation through simulation experiments. Web traffic analysis has been carried out for two decades, usually based on Web server log data. However, while the character of the overall Web traffic has been extensively studied and modeled, relatively few studies have been devoted to the analysis of Web traffic generated by Internet robots (Web bots). Moreover, the overwhelming majority of studies concern the traffic on non-e-commerce websites. In this paper we address the problem of modeling a realistic arrival process of bots’ requests on an e-commerce Web server. Based on real…

Web server; General Computer Science; Computer science; Internet robot; Real-time computing; Engineering and technology; E-commerce; Session (web analytics); Theoretical Computer Science; Web traffic characterization; Web traffic; Electrical engineering, electronic engineering, information engineering; Traffic generation model; Web traffic analysis and modeling; Computer Systems Organization: Computer-Communication Networks; Networking & telecommunications; Web bot; Heavy-tailed distribution; Modeling and Simulation; Artificial intelligence & image processing; The Internet; Web log analysis software; Log file analysis; Data mining; Regression analysis

Journal of Computational Science

researchProduct

Modeling a session-based bots' arrival process at a Web server

2017

analysis and modeling; user session; regression analysis; Web server; Internet robot; log file; Web traffic; Web workload; Web bot

researchProduct

Bot or Not? A Case Study on Bot Recognition from Web Session Logs

2019

Sequential decision; Classification; Web bot recognition

researchProduct

Online Web Bot Detection Using a Sequential Classification Approach

2019

A significant problem nowadays is the detection of Web traffic generated by automatic software agents (Web bots). Some studies have dealt with this task by proposing various approaches to Web traffic classification in order to distinguish the traffic stemming from human users' visits from that generated by bots. Most previous works addressed the problem of offline bot recognition, based on available information on user sessions completed on a Web server. Very few approaches, however, have been proposed to recognize bots online, before the session completes. This paper proposes a novel approach to binary classification of a multivariate data stream incoming on a Web server, in order to recogn…

Web server; HTTP request analysis; Internet security; Machine learning; Neural networks; Sequential classification; Web bot detection; Settore INF/01 - Informatica; Computer science; Networking & telecommunications; Engineering and technology; Session (web analytics); Task (computing); Web traffic; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Artificial intelligence

2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)

researchProduct

Bot recognition in a Web store: An approach based on unsupervised learning

2020

Web traffic on e-business sites is increasingly dominated by artificial agents (Web bots), which pose a threat to website security, privacy, and performance. To develop efficient bot detection methods and discover reliable e-customer behavioural patterns, the accurate separation of traffic generated by legitimate users and Web bots is necessary. This paper proposes a machine learning solution to the problem of bot and human session classification, with a specific application to e-commerce. The approach studied in this work explores the use of unsupervised learning (k-means and Graded Possibilistic c-Means), followed by supervised labelling of clusters, a generative learning stra…

Unsupervised classification; Web bot detection; Computer Networks and Communications; Computer science; Internet robot; Engineering and technology; Machine learning; Web traffic; Web server; Electrical engineering, electronic engineering, information engineering; Artificial neural network; Supervised learning; Networking & telecommunications; Perceptron; Web application security; Web bot; Computer Science Applications; Support vector machine; Generative model; Computing Methodologies: Pattern Recognition; Hardware and Architecture; Supervised classification; Unsupervised learning; Artificial intelligence & image processing; Artificial intelligence

researchProduct

Identifying legitimate Web users and bots with different traffic profiles — an Information Bottleneck approach

2020

Recent studies reported that about half of Web users nowadays are intelligent agents (Web bots). Many bots are impersonators operating at a very high sophistication level, trying to emulate navigational behaviors of legitimate users (humans). Moreover, bot technology continues to evolve, which makes bot detection even harder. To deal with this problem, many advanced methods for differentiating bots from humans have been proposed, a large part of which relies on supervised machine learning techniques. In this paper, we propose a novel approach to identify various profiles of bots and humans which combines feature selection and unsupervised learning of HTTP-level traffic patterns to d…

Web user; Information Systems and Management; Computer science; Internet robot; Feature selection; Engineering and technology; Machine learning; Unsupervised learning; Session (web analytics); Management Information Systems; Intelligent agent; Artificial intelligence; Information systems; Electrical engineering, electronic engineering, information engineering; Cluster analysis; Bot detection; Information bottleneck method; Web bot; Server log; Hierarchical clustering; Artificial intelligence & image processing; Software

Knowledge-Based Systems

researchProduct

Efficiency Analysis Of Resource Request Patterns In Classification Of Web Robots And Humans

2018

The paper deals with the problem of classifying Web traffic generated by robots and humans on e-commerce websites. Owing to the still-growing proliferation and specialization of bots, a large body of research into the characterization and recognition of their traffic has been conducted so far. In particular, some approaches to classifying bot and human sessions on websites have been proposed in the literature. In this paper we verify and discuss the efficiency of one such recently proposed approach, which uses differences in resource request patterns of bots and humans. We reconstructed Web sessions from actual HTTP log data for three different e-commerce sites, varying in the traffic intensity an…

HTTP Traffic; Web Traffic; Web Crawler; Internet Robot; Web Server; Web Bot; Classification

Communications of the ECMS

researchProduct